Add control and data plane HPA #3492


Open · nowjean wants to merge 14 commits into main from the hpa branch of their fork

Conversation


@nowjean nowjean commented Jun 11, 2025

Proposed changes


Problem: I want NGF to work with a HorizontalPodAutoscaler

Solution: Add HPA support for the control plane and data plane Deployments.

Testing: Deployed to an AKS cluster and verified that the HPA works correctly.

Closes #3447

Checklist

Before creating a PR, run through this checklist and mark each as complete.

  • I have read the CONTRIBUTING doc
  • I have added tests that prove my fix is effective or that my feature works
  • I have checked that all unit tests pass after adding my changes
  • I have updated necessary documentation
  • I have rebased my branch onto main
  • I will ensure my PR is targeting the main branch and pulling from my branch from my own fork

Release notes


Add HPA support for the control plane and data plane Deployments. The data plane HPA is configured through the NginxProxy CRD.


nginx-bot bot commented Jun 11, 2025

Hi @nowjean! Welcome to the project! 🎉

Thanks for opening this pull request!
Be sure to check out our Contributing Guidelines while you wait for someone on the team to review this.

@nginx-bot added the community label Jun 11, 2025
Contributor

github-actions bot commented Jun 11, 2025

✅ All required contributors have signed the F5 CLA for this PR. Thank you!
Posted by the CLA Assistant Lite bot.

@github-actions bot added the helm-chart label Jun 11, 2025
@nowjean
Author

nowjean commented Jun 11, 2025

I have hereby read the F5 CLA and agree to its terms

@sjberman sjberman moved this from 🆕 New to External Pull Requests in NGINX Gateway Fabric Jun 12, 2025
@salonichf5
Contributor

Thank you for your contribution to the project. Please run make generate-all from the root directory to auto-generate the schema files.

@nowjean nowjean requested a review from a team as a code owner June 14, 2025 01:31
@github-actions bot added the documentation label Jun 14, 2025
@nowjean
Author

nowjean commented Jun 14, 2025

> Thank you for your contribution to the project. Please run make generate-all from the root directory to auto-generate the schema files.

I’ve run make generate-all. Could you please review my PR?

@nowjean force-pushed the hpa branch 2 times, most recently from 172c009 to d081d68 on June 14, 2025
@sjberman
Collaborator

sjberman commented Jun 17, 2025

So this only affects the control plane, correct? We probably want to support this for the nginx data plane as well (seems like that would be the more beneficial use case).

In order to configure deployment options for the data plane, it requires a bit more work, specifically in our APIs and the code itself. The NginxProxy CRD holds the deployment configuration for the nginx data plane, which the control plane uses to configure the data plane when deploying it.

Here is a simple example of how we add a new field to the API to allow for configuring these types of deployment fields: #3319.

@sjberman added the enhancement label Jun 17, 2025
@sjberman
Collaborator

I'd also love a more descriptive PR title, as well as a release note in the description so we can include this feature in our release notes :)

@nowjean
Author

nowjean commented Jun 17, 2025

@sjberman Yes, this PR only affects the control plane. Can we also implement HPA for the data plane?

AFAIK, the data plane Deployment is created by the NginxProxy CRD, and its name depends on the Gateway's name parameter. This means we can't assume a fixed name in values.yaml.

HPA only applies to Deployments with a fixed name, like:

scaleTargetRef:
  apiVersion: apps/v1
  kind: Deployment
  name: my-app

So, I think we can't implement HPA via the Helm chart, especially since data plane and control plane pods are now separated in 2.0.
Do you have any suggestions or recommended approaches for enabling HPA on the data plane?

@sjberman
Collaborator

@nowjean I updated my comment with a description on how it can be implemented on the data plane side. Glad we're on the same page :)

@salonichf5
Contributor

> > Thank you for your contribution to the project. Please run make generate-all from the root directory to auto-generate the schema files.
>
> I’ve run make generate-all. Could you please review my PR?

Will manually test this PR for both control plane and data plane once we have all the changes :)

@github-actions bot removed the enhancement label Jun 20, 2025
@nowjean
Author

nowjean commented Jun 20, 2025

@sjberman @salonichf5 I've pushed my changes to this PR.
This is my first time working with Golang and creating CRDs, so I’d appreciate your review.

From my testing, the code correctly applies HPA to both the control plane and data plane.
To enable this, please set nginx.autoscaling.enabled: true and configure nginx.container.resources accordingly.
(Screenshots: data plane and control plane HPA status.)
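For reference, here is a minimal values.yaml sketch along the lines described above (key names taken from the chart changes discussed in this PR; the numbers are placeholders, not recommendations):

nginx:
  autoscaling:
    # Enable the data plane HPA
    enabled: true
    minReplicas: 1
    maxReplicas: 3
    targetCPUUtilizationPercentage: 60
  container:
    # Resource requests are required for Utilization-based HPA metrics
    resources:
      requests:
        cpu: 500m
        memory: 512Mi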

@salonichf5
Contributor

salonichf5 commented Jun 20, 2025

Testing applying these HPAs for the control plane and data plane pods:

  1. Control plane pod

values.yaml

  autoscaling:
    # Enable or disable Horizontal Pod Autoscaler
    enabled: true
    annotations: {
       example.com/test: "true" 
    }
    minReplicas: 2
    maxReplicas: 4
    targetCPUUtilizationPercentage: 60
    targetMemoryUtilizationPercentage: 70
    behavior: 
      scaleUp:
        stabilizationWindowSeconds: 100
        policies:
        - type: Pods
          value: 2
          periodSeconds: 20
      scaleDown:
        stabilizationWindowSeconds: 200
        policies:
        - type: Pods
          value: 1
          periodSeconds: 20
  autoscalingTemplate: 
  - type: Pods
    pods:
      metric:
        name: nginx_gateway_fabric_nginx_process_requests_total
      target:
        type: AverageValue
        averageValue: 10000m

HPA details

k describe horizontalpodautoscalers.autoscaling -n nginx-gateway ngf-nginx-gateway-fabric
Name:                                                     ngf-nginx-gateway-fabric
Namespace:                                                nginx-gateway
Labels:                                                   app.kubernetes.io/instance=ngf
                                                          app.kubernetes.io/managed-by=Helm
                                                          app.kubernetes.io/name=nginx-gateway-fabric
                                                          app.kubernetes.io/version=edge
                                                          helm.sh/chart=nginx-gateway-fabric-2.0.1
Annotations:                                              example.com/test: true
                                                          meta.helm.sh/release-name: ngf
                                                          meta.helm.sh/release-namespace: nginx-gateway
CreationTimestamp:                                        Fri, 20 Jun 2025 14:16:18 -0600
Reference:                                                Deployment/ngf-nginx-gateway-fabric
Metrics:                                                  ( current / target )
  resource memory on pods  (as a percentage of request):  <unknown> / 70%
  resource cpu on pods  (as a percentage of request):     <unknown> / 60%
Min replicas:                                             2
Max replicas:                                             4
Behavior:
  Scale Up:
    Stabilization Window: 100 seconds
    Select Policy: Max
    Policies:
      - Type: Pods  Value: 2  Period: 20 seconds
  Scale Down:
    Stabilization Window: 200 seconds
    Select Policy: Max
    Policies:
      - Type: Pods  Value: 1  Period: 20 seconds
Deployment pods:    2 current / 2 desired
Conditions:
  Type           Status  Reason                   Message
  ----           ------  ------                   -------
  AbleToScale    True    SucceededGetScale        the HPA controller was able to get the target's current scale
  ScalingActive  False   FailedGetResourceMetric  the HPA was unable to compute the replica count: failed to get memory utilization: unable to get metrics for resource memory: unable to fetch metrics from resource metrics API: the server could not find the requested resource (get pods.metrics.k8s.io)
Events:
  Type     Reason                        Age                 From                       Message
  ----     ------                        ----                ----                       -------
  Normal   SuccessfulRescale             11m                 horizontal-pod-autoscaler  New size: 2; reason: Current number of replicas below Spec.MinReplicas
  Warning  FailedGetResourceMetric       9m5s (x8 over 10m)  horizontal-pod-autoscaler  failed to get cpu utilization: unable to get metrics for resource cpu: unable to fetch metrics from resource metrics API: the server could not find the requested resource (get pods.metrics.k8s.io)
  Warning  FailedComputeMetricsReplicas  9m5s (x8 over 10m)  horizontal-pod-autoscaler  invalid metrics (2 invalid out of 2), first error is: failed to get memory resource metric value: failed to get memory utilization: unable to get metrics for resource memory: unable to fetch metrics from resource metrics API: the server could not find the requested resource (get pods.metrics.k8s.io)
  Warning  FailedGetResourceMetric       50s (x41 over 10m)  horizontal-pod-autoscaler  failed to get memory utilization: unable to get metrics for resource memory: unable to fetch metrics from resource metrics API: the server could not find the requested resource (get pods.metrics.k8s.io)

Needed to install the metrics server (enabling insecure TLS) to get metrics for resource memory, and ScalingActive is false since I did not set a memory request.

Should this be communicated to end users, i.e. that these additional fields must be set if we want scaling to be active?

  2. Autoscaling set for both control plane and data plane pods

values.yaml


# NGINX Gateway Fabric

  resources:
    requests:
      cpu: "100m"
      memory: "128Mi"
    limits:
      cpu: "500m"
      memory: "256Mi"

  autoscaling:
    # Enable or disable Horizontal Pod Autoscaler
    enabled: true
    annotations: {
       example.com/test: "true" 
    }
    minReplicas: 2
    maxReplicas: 4
    targetCPUUtilizationPercentage: 60
    targetMemoryUtilizationPercentage: 70
    behavior: 
      scaleUp:
        stabilizationWindowSeconds: 100
        policies:
        - type: Pods
          value: 2
          periodSeconds: 20
      scaleDown:
        stabilizationWindowSeconds: 200
        policies:
        - type: Pods
          value: 1
          periodSeconds: 20
  autoscalingTemplate: 
  - type: Pods
    pods:
      metric:
        name: nginx_gateway_fabric_nginx_process_requests_total
      target:
        type: AverageValue
        averageValue: 10000m
        


#  NGINX Deployment
  autoscaling:
    # Enable or disable Horizontal Pod Autoscaler
    enabled: true
    hpaAnnotations: {
        cafe.example.com/test: "true"
    }
    minReplicas: 2
    maxReplicas: 4
    targetCPUUtilizationPercentage: 40
    targetMemoryUtilizationPercentage: 30
    behavior:
      scaleDown:
        stabilizationWindowSeconds: 100
        policies:
        - type: Pods
          value: 3
          periodSeconds: 20
      scaleUp:
        stabilizationWindowSeconds: 50
        policies:
        - type: Pods
          value: 1
          periodSeconds: 20



  container:
    # -- The resource requirements of the NGINX container. You should set this value if you want to use data plane autoscaling (HPA).
    resources:
      requests:
        cpu: 500m
        memory: 2Gi

HPA details

k describe horizontalpodautoscalers.autoscaling -n nginx-gateway ngf-nginx-gateway-fabric
Name:                                                     ngf-nginx-gateway-fabric
Namespace:                                                nginx-gateway
Labels:                                                   app.kubernetes.io/instance=ngf
                                                          app.kubernetes.io/managed-by=Helm
                                                          app.kubernetes.io/name=nginx-gateway-fabric
                                                          app.kubernetes.io/version=edge
                                                          helm.sh/chart=nginx-gateway-fabric-2.0.1
Annotations:                                              example.com/test: true
                                                          meta.helm.sh/release-name: ngf
                                                          meta.helm.sh/release-namespace: nginx-gateway
CreationTimestamp:                                        Fri, 20 Jun 2025 14:56:42 -0600
Reference:                                                Deployment/ngf-nginx-gateway-fabric
Metrics:                                                  ( current / target )
  resource memory on pods  (as a percentage of request):  21% (29343744) / 70%
  resource cpu on pods  (as a percentage of request):     7% (7m) / 60%
Min replicas:                                             2
Max replicas:                                             4
Behavior:
  Scale Up:
    Stabilization Window: 100 seconds
    Select Policy: Max
    Policies:
      - Type: Pods  Value: 2  Period: 20 seconds
  Scale Down:
    Stabilization Window: 200 seconds
    Select Policy: Max
    Policies:
      - Type: Pods  Value: 1  Period: 20 seconds
Deployment pods:    2 current / 2 desired
Conditions:
  Type            Status  Reason            Message
  ----            ------  ------            -------
  AbleToScale     True    ReadyForNewScale  recommended size matches current size
  ScalingActive   True    ValidMetricFound  the HPA was able to successfully calculate a replica count from memory resource utilization (percentage of request)
  ScalingLimited  True    TooFewReplicas    the desired replica count is less than the minimum replica count
Events:
  Type     Reason                   Age    From                       Message
  ----     ------                   ----   ----                       -------
  Normal   SuccessfulRescale        5m40s  horizontal-pod-autoscaler  New size: 2; reason: Current number of replicas below Spec.MinReplicas
  Warning  FailedGetResourceMetric  5m25s  horizontal-pod-autoscaler  failed to get cpu utilization: did not receive metrics for targeted pods (pods might be unready)

I saw an HPA get configured for the control plane pod, but I couldn't see one configured for the data plane pod.

Events from the nginx deployment and the logs look normal.

helm template ngf ./charts/nginx-gateway-fabric -f ./charts/nginx-gateway-fabric/values.yaml
# Source: nginx-gateway-fabric/templates/hpa.yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  annotations: 
    example.com/test: "true"
  labels:
    helm.sh/chart: nginx-gateway-fabric-2.0.1
    app.kubernetes.io/name: nginx-gateway-fabric
    app.kubernetes.io/instance: ngf
    app.kubernetes.io/version: "edge"
    app.kubernetes.io/managed-by: Helm
  name: ngf-nginx-gateway-fabric
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: ngf-nginx-gateway-fabric
  minReplicas: 2
  maxReplicas: 4
  metrics:
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 70
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 60
  behavior:
    scaleDown:
      policies:
      - periodSeconds: 20
        type: Pods
        value: 1
      stabilizationWindowSeconds: 200
    scaleUp:
      policies:
      - periodSeconds: 20
        type: Pods
        value: 2
      stabilizationWindowSeconds: 100

The NginxProxy resource reflects the resources values but not the autoscaling fields.

k describe nginxproxies.gateway.nginx.org -n nginx-gateway ngf-proxy-config
Name:         ngf-proxy-config
Namespace:    nginx-gateway
Labels:       app.kubernetes.io/instance=ngf
              app.kubernetes.io/managed-by=Helm
              app.kubernetes.io/name=nginx-gateway-fabric
              app.kubernetes.io/version=edge
              helm.sh/chart=nginx-gateway-fabric-2.0.1
Annotations:  meta.helm.sh/release-name: ngf
              meta.helm.sh/release-namespace: nginx-gateway
API Version:  gateway.nginx.org/v1alpha2
Kind:         NginxProxy
Metadata:
  Creation Timestamp:  2025-06-20T20:56:42Z
  Generation:          1
  Resource Version:    481755
  UID:                 6140ca30-cd48-4b13-b53f-a7e382f842db
Spec:
  Ip Family:  dual
  Kubernetes:
    Deployment:
      Container:
        Image:
          Pull Policy:  Never
          Repository:   nginx-gateway-fabric/nginx
          Tag:          sa.choudhary
        Resources:
          Requests:
            Cpu:     500m
            Memory:  2Gi
      Replicas:      1
    Service:
      External Traffic Policy:  Local
      Type:                     LoadBalancer
Events:                         <none>

So, a couple of observations:

  • You are not specifying scaleTargetRef in the NginxProxy resource.
  • Also, during helm install I am running into an error about the unknown field "spec.kubernetes.deployment.autoscaling" coming from the NginxProxy template. So the API has been updated, but maybe something is off with how the YAML is nested.
  • I think this could be a possible issue. I did spend some time debugging and adding target refs, but that failed too:
W0620 15:33:11.844029   43067 warnings.go:70] unknown field "spec.kubernetes.deployment.autoscaling"
W0620 15:33:11.844075   43067 warnings.go:70] unknown field "spec.scaleTargetRef"

What am I doing wrong in terms of testing? @sjberman @nowjean

@nowjean
Author

nowjean commented Jun 21, 2025

@salonichf5 @sjberman Thanks for testing!

Please refer to the guide below and review my PR again.

I've patched the Makefile's generate-crds target with:

crd:maxDescLen=0

This option turns off the descriptions in the generated CRDs, because otherwise the new NginxProxy manifest fails to apply with:

The CustomResourceDefinition "nginxproxies.gateway.nginx.org" is invalid: metadata.annotations: Too long: must have at most 262144 bytes

  1. Rebuild the image
     I've updated the CRDs, so please rebuild the NGF Docker image:
     make generate-all
     make build-ngf-image
     (You'll find these targets in the Makefile.)

  2. Apply the new NginxProxy CRD:

kubectl apply -f gateway.nginx.org_nginxproxies.yaml

  3. Point Helm to the new image
     Edit values.yaml so the Helm chart uses the image you just built:

 image:
   # -- The NGINX Gateway Fabric image to use
   repository: ghcr.io/nginx/nginx-gateway-fabric  # → replace with your new image
   tag: edge  # → replace with your new tag

(In my case, I had to upgrade my runc version to build the NGF Docker images.)

  4. Deploy the Helm chart
  • The control plane HPA is created automatically by the Helm template.

  • The data plane HPA is created per Gateway, because every Gateway gets its own data plane Deployment, Service, and so on. After you apply a Gateway, you'll see an HPA for its Deployment.

End users can create multiple Gateways, and each one needs its own HPA, so the HPA logic now lives in the Gateway provisioning code. A rough sketch of the resulting NginxProxy configuration is shown below.
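For illustration, here is a rough sketch of where the autoscaling settings are meant to land on the NginxProxy resource under this design (field names inferred from the Helm values and the spec.kubernetes.deployment.autoscaling path seen in the testing above, not the authoritative schema):

apiVersion: gateway.nginx.org/v1alpha2
kind: NginxProxy
metadata:
  name: ngf-proxy-config
  namespace: nginx-gateway
spec:
  kubernetes:
    deployment:
      container:
        resources:
          requests:
            cpu: 500m
            memory: 2Gi
      autoscaling:
        enabled: true
        minReplicas: 2
        maxReplicas: 4
        targetCPUUtilizationPercentage: 40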

Plus, I'm not sure about this part:

> Needed to install the metrics server (enabling insecure TLS) to get metrics for resource memory, and ScalingActive is false since I did not set a memory request.
> Should this be communicated to the end user about setting additional fields if we want scaling to be active?

Normally, we assume that end users already have the Metrics Server running if they're using HPA or similar features. But maybe it's worth adding a note in the docs to avoid confusion.
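If we do add a docs note, a short sketch of the requirement (the container name and values here are illustrative): each Utilization-based HPA metric only works when the target container declares the matching resource request, and a metrics server must be installed in the cluster.

containers:
  - name: nginx-gateway   # illustrative container name
    resources:
      requests:
        cpu: 100m      # needed for targetCPUUtilizationPercentage
        memory: 128Mi  # needed for targetMemoryUtilizationPercentage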

@nowjean force-pushed the hpa branch 3 times, most recently from c57e992 to e8399d9 on June 24, 2025
@nowjean
Author

nowjean commented Jul 24, 2025

@sjberman I rebased onto the upstream main branch and resolved the conflicts.

git rebase upstream/main
[resolve conflicts..]
git add .
git rebase --continue

and then,

make generate-all
make unit-test

git add .
git commit -m "blah blah blah~~"

It was a pretty complicated task for me. Anyway, I think the hpa branch has been rebased successfully and all conflicts are resolved.
Please check this! Thanks.

@sjberman added the enhancement label Jul 24, 2025
@@ -460,6 +466,49 @@ type DaemonSetSpec struct {
Patches []Patch `json:"patches,omitempty"`
}

// +kubebuilder:validation:XValidation:message="at least one metric must be specified when autoscaling is enabled",rule="!self.enabled || (has(self.targetCPUUtilizationPercentage) || has(self.targetMemoryUtilizationPercentage) || (has(self.autoscalingTemplate) && size(self.autoscalingTemplate) > 0))"
// +kubebuilder:validation:XValidation:message="minReplicas must be less than or equal to maxReplicas",rule="self.minReplicas <= self.maxReplicas"
// +kubebuilder:validation:XValidation:message="CPU utilization must be between 1 and 100",rule="!has(self.targetCPUUtilizationPercentage) || (self.targetCPUUtilizationPercentage >= 1 && self.targetCPUUtilizationPercentage <= 100)"
Collaborator

This doesn't need to use CEL validation. You can just set minimum and maximum directly on the field. See other int values as examples in here.

Collaborator

This still needs to be fixed (and the comment below).

Collaborator

To be specific, we just need to remove the validation for the values that specify max/min, and convert those to max/min kubebuilder tags. The other validation for "at least one metric" can stay.
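For illustration, with plain kubebuilder Minimum/Maximum markers the generated CRD carries the bounds directly in the OpenAPI schema instead of a CEL rule; roughly (a sketch of the generated schema, not the actual file):

targetCPUUtilizationPercentage:
  type: integer
  format: int32
  minimum: 1
  maximum: 100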

// +kubebuilder:validation:XValidation:message="at least one metric must be specified when autoscaling is enabled",rule="!self.enabled || (has(self.targetCPUUtilizationPercentage) || has(self.targetMemoryUtilizationPercentage) || (has(self.autoscalingTemplate) && size(self.autoscalingTemplate) > 0))"
// +kubebuilder:validation:XValidation:message="minReplicas must be less than or equal to maxReplicas",rule="self.minReplicas <= self.maxReplicas"
// +kubebuilder:validation:XValidation:message="CPU utilization must be between 1 and 100",rule="!has(self.targetCPUUtilizationPercentage) || (self.targetCPUUtilizationPercentage >= 1 && self.targetCPUUtilizationPercentage <= 100)"
// +kubebuilder:validation:XValidation:message="memory utilization must be between 1 and 100",rule="!has(self.targetMemoryUtilizationPercentage) || (self.targetMemoryUtilizationPercentage >= 1 && self.targetMemoryUtilizationPercentage <= 100)"
Collaborator

Same comment

@github-actions bot removed the enhancement label Jul 28, 2025
@sjberman
Collaborator

@nowjean We're looking at doing a release in the next couple of weeks, and would love to include these changes. If your schedule is too busy to focus on this, that's totally okay. Let us know, and we can also look into taking over the remaining pieces of this work.

@nowjean
Author

nowjean commented Jul 30, 2025

@sjberman Cool, I think 2 or 3 weeks should do it. I'm on it!

@sjberman
Collaborator

> Cool, I think 2 or 3 weeks should do it. I'm on it!

@nowjean Thanks, we would need this merged in by the end of next week. If that's not enough time for you, let me know and we can take over.

@nowjean
Author

nowjean commented Aug 4, 2025

> > Cool, I think 2 or 3 weeks should do it. I'm on it!
>
> @nowjean Thanks, we would need this merged in by the end of next week. If that's not enough time for you, let me know and we can take over.

I have pushed the feature. Please check it. It seems to be working correctly in my cluster.

@michasHL
Contributor

michasHL commented Aug 4, 2025

Sorry to butt in on this PR, but I just found this effort and have to say I really like what I'm seeing.
One of our existing services scales an NGINX Ingress Controller using Keda, utilizing a Prometheus query based on the number of active connections.

Keda itself likes to create and manage the HPA so it can set the scale target etc. by itself.
I understand that this specific effort will create an HPA resource and prevent the NginxProxy from getting in the way of the HPA scaler.

Is there a way to incorporate a change, either here or as a follow-up, that would remove the need to provide the original number of replicas and turn off the NginxProxy-led scaling, so we can utilize Keda for it? That is, the ScaledObject would point to the Deployment and do the scaling.

If that is not an option, I believe we can utilize the option to transfer ownership of the HPA to Keda to still make it work as long as we know the name of the HPA.

@nowjean
Author

nowjean commented Aug 5, 2025

@michasHL
Thanks for your opinion. However, AFAIK, KEDA is an event-driven autoscaler that still uses the standard HPA mechanism under the hood. So the core autoscaling logic remains the same as HPA.

Also, the NGINX Ingress Controller Helm chart provides options for either HPA or KEDA. Additionally, KEDA requires a separate controller to be installed, whereas HPA is built into Kubernetes by default. This means KEDA is not a drop-in replacement for HPA.

Therefore, I think if we need KEDA-specific features, we should implement them in a follow-up after this PR is merged.
If I’ve misunderstood anything, please feel free to correct me.

Collaborator

@sjberman left a comment

Still need #3492 (comment) and #3492 (comment) to be addressed.

Also need a rebase on main (and any conflicts fixed). Almost done, nice work!

@@ -615,7 +615,24 @@ func (p *NginxProvisioner) buildNginxDeployment(
}

if deploymentCfg.Replicas != nil {
Collaborator

Even if this replicas field is nil (which happens when a user doesn't specify anything in the NginxProxy resource for this field), we should still perform the HPA logic you added below.

So basically you'll have

var replicas *int32
if deploymentCfg.Replicas != nil {
   replicas = deploymentCfg.Replicas
}

if isAutoscalingEnabled...

deployment.Spec.Replicas = replicas

if err == nil && hpa.Status.DesiredReplicas > 0 {
// overwrite with HPA's desiredReplicas
replicas = helpers.GetPointer(hpa.Status.DesiredReplicas)
}
Collaborator

Can we also add a unit test for this case?

@michasHL
Contributor

michasHL commented Aug 5, 2025

> @michasHL Thanks for your opinion. However, AFAIK, KEDA is an event-driven autoscaler that still uses the standard HPA mechanism under the hood. So the core autoscaling logic remains the same as HPA.
>
> Also, the NGINX Ingress Controller Helm chart provides options for either HPA or KEDA. Additionally, KEDA requires a separate controller to be installed, whereas HPA is built into Kubernetes by default. This means KEDA is not a drop-in replacement for HPA.
>
> Therefore, I think if we need KEDA-specific features, we should implement them in a follow-up after this PR is merged. If I’ve misunderstood anything, please feel free to correct me.

Absolutely, didn't want to diminish any work you've already done on this.
You're right that Keda needs an additional controller and that this is a more niche use case. For Keda to work it needs to "own" the HPA; that means it needs to drive the number of desired replicas based on its own enhanced evaluation (e.g. using active connections via Prometheus). Normally we accomplish that by replacing the HPA Kubernetes resource with a ScaledObject custom resource, which under the hood creates an HPA.

Lucky for us, Keda introduced the concept of taking ownership of an existing HPA as long as we know its name; see the sketch below.

So, I think our use-case will be covered by what you already have.
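For anyone following along, a rough sketch of the ownership-transfer approach described above, based on KEDA's ScaledObject options (the annotation, resource names, and query here are assumptions for illustration; check the KEDA docs and the provisioned HPA's actual name before relying on this):

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: gateway-nginx-scaler
  annotations:
    # Ask KEDA to adopt the HPA that NGF already provisioned
    scaledobject.keda.sh/transfer-hpa-ownership: "true"
spec:
  scaleTargetRef:
    name: my-gateway-nginx                # the Gateway's data plane Deployment (illustrative name)
  advanced:
    horizontalPodAutoscalerConfig:
      name: my-gateway-nginx              # name of the existing HPA to take over (illustrative)
  triggers:
    - type: prometheus
      metadata:
        serverAddress: http://prometheus.monitoring:9090
        query: sum(nginx_connections_active)   # illustrative active-connections query
        threshold: "100"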

Keep up the good work and I'm looking forward to seeing this change in the next release :)

@sjberman
Collaborator

sjberman commented Aug 5, 2025

@michasHL By owning the HPA, does KEDA need to directly edit its config? Just want to clarify, since that would require some thought on how to implement, since right now every resource we provision is owned by our control plane, and can't be edited out of band (meaning that KEDA wouldn't be able to edit the HPA).

Labels
community, documentation, helm-chart, release-notes
Projects
Status: External Pull Requests
Development

Successfully merging this pull request may close these issues.

HorizontalPodAutoscaler Support
7 participants